docs: clarify auto-variadic socket input ordering in connect() #11053
sjrl merged 4 commits into deepset-ai:main
@davidsbatista I can take the review of this one
Document the input ordering behavior of auto-promoted lazy variadic sockets in Pipeline.connect() and PipelineBase._make_socket_auto_variadic(). When multiple senders are connected to the same list-typed receiver socket, the items in the resulting list are ordered alphabetically by sender component name (because Pipeline.run() schedules components in alphabetical order for deterministic execution), not by the order in which connect() was called. The docstrings now point users to a dedicated joiner component when they need explicit ordering. fixes deepset-ai#10979
Applied sjrl's review feedback. The previous text was wrong about DocumentJoiner (it behaves the same as a promoted variadic socket) and it missed the difference between Pipeline and AsyncPipeline. Pipeline schedules components alphabetically, so the resulting list is ordered by sender name. AsyncPipeline runs branches in parallel, so ordering is not guaranteed. Updated both the connect() docstring and the _make_socket_auto_variadic note, plus the release note.
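The ordering difference described in these commits can be illustrated with a small toy model. This is a sketch under stated assumptions, not Haystack code: `run_sync`, `run_async`, the component names, and the delays are all hypothetical and only mimic the two scheduling styles (sender names visited alphabetically versus branches finishing in parallel).

```python
import asyncio

# Toy model only: none of these functions are Haystack APIs.


def run_sync(connect_order, outputs):
    """Collect sender outputs the way a name-sorted scheduler would."""
    received = []
    for name in sorted(connect_order):  # senders are visited alphabetically
        received.append(outputs[name])
    return received


async def run_async(delays, outputs):
    """Collect sender outputs in completion order, as parallel branches would."""
    received = []

    async def sender(name):
        await asyncio.sleep(delays[name])  # simulated work of one branch
        received.append(outputs[name])

    await asyncio.gather(*(sender(name) for name in delays))
    return received


# Hypothetical senders, listed in the order connect() was called.
connect_order = ["web_retriever", "bm25_retriever"]
outputs = {"web_retriever": "doc from web", "bm25_retriever": "doc from bm25"}

# Sorted scheduling ignores the connect() order.
sync_result = run_sync(connect_order, outputs)

# Parallel scheduling depends on which branch finishes first.
async_result = asyncio.run(
    run_async({"bm25_retriever": 0.05, "web_retriever": 0.0}, outputs)
)

print(sync_result)
print(async_result)
```

In the toy model, the synchronous result follows the sender names regardless of the `connect_order` list, while the asynchronous result follows whichever simulated branch finishes first, so changing the delays changes the list order.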
e7d7ecd to b390210
Thanks for catching that @sjrl. You're right, the DocumentJoiner bit was off and the Pipeline vs AsyncPipeline distinction was missing. I applied your wording in the connect() docstring and propagated the same clarification to the _make_socket_auto_variadic note and the release note so all three read consistently.
Switch the inline literals from double to single backticks so they match how the rest of our api docstrings are written. Double backticks belong in the release notes, not here. Also drop the note block from the private _make_socket_auto_variadic helper. That method is not exposed on the docs site, so the extra context does not buy users anything. The public-facing explanation on connect() is enough.
Thanks for the pointers. Switched the inline literals in connect() over to single backticks and dropped the note block from _make_socket_auto_variadic since that method isn't on the docs site. Let me know if anything else stands out.
Related Issues
Proposed Changes:
@julian-risch and @sjrl discussed this in #10979 and went with Option A: keep the current behavior but document it better.
Updated two docstrings in haystack/core/pipeline/base.py:
- Pipeline.connect() now mentions that when multiple senders are connected to the same list-typed receiver socket, the items in the resulting list end up ordered alphabetically by sender component name, not by the order connect() was called. Users who need a specific order should use a dedicated joiner component like DocumentJoiner.
- _make_socket_auto_variadic() got the same note plus a sentence explaining the underlying reason: Pipeline.run() schedules components in alphabetical order so that execution stays deterministic and independent of pipeline insertion order.
Also added a release note under releasenotes/notes/.
How did you test it?
This is a docs-only change. No new tests needed. Verified the file still parses cleanly with ruff.
Notes for the reviewer
I couldn't run hatch run fmt locally because hatch wasn't installed in my environment, but I ran ruff directly on the modified file and it passes (ruff check and ruff format --check both clean).
Happy to tweak the wording if you want it phrased differently.
Checklist
docs: